perm filename RPT.RLL[RDG,DBL] blob sn#537643 filedate 1980-09-25 generic text, type T, neo UTF8
Report of the ESW Oil Spill Effort
RLL: A Representation Language Language

Doug Lenat and Russ Greiner



Unlike most groups,  we (Lenat and  Greiner) focused on  the entire  spill
crisis treatment scenario,  and paid  only slight extra  attention to  the
subproblems of Discovery  (initial intake interview)  and Source  Location
(by backtracking or by indirect  analysis).  In fact, we considered  OTHER
problems, such as locating an escaped convict (where the unwanted material
is spilled onto conduits (roads) and must be located, etc.).  Whenever any
piece of knowledge was added to RLL, the question we invariably posed was:
can this be generalized  or abstracted in some  way, and still retain  its
potency, its power for constraining search?  Most of the knowledge we have
so far represented within RLL  is common to both  the convict and the  oil
spill problems, and  is represented in  a manner usable  by the system  in
either context.  Of course there are individual differences in technology,
such as road blocks instead of absorbent booms, but those differences  are
at a much lower  level (e.g., terminology)  than most inference  processes
deal with.  This kind of generality is  one of the major powers of RLL  --
and, due  to the  effort required  to  exercise it,  one of  the  greatest
liabilities when a constraint is to have a running system in two days.  As
we hope RLL's mechanisms will eventually be widely used, we are attempting
to enter the information -- whether  data or control  structure -- in   as
unbiased and  extensible a  manner  as possible.   We chose  to  sacrifice
"performing a flashy demo" for "representing things the right way"; toward
the end,  we had  to sacrifice  both of  them to  get even  a meager  demo
running.

One of the early  exercises we performed was  to hand-simulate a  dialogue
with the system.  It became  clear that we would  have to choose a  "role"
for the system to play.   We noted that the  greatest need was during  the
night, when nightshift workers who  were ill-equipped to deal with  spills
nevertheless had to. Thus, our model is one where a spill is  encountered,
called in to the  program, and the latter  then directs the activities  of
the discoverer, sends out other teams, notifies various authorities,  etc.
Thus the role is one of REPLACING the expert in this process.  We  believe
that almost all the  information can, however, also  be used for  tutorial
purposes,  for  advising  an  expert,   etc.   This  is  one  reason   for
representing each piece  of knowledge EXPLICITLY,  rather than burying  it
within a piece of code.

As our simulation continued, we observed  that there was frequent need  to
"suspend" one of  the major  tasks we  had begun,  to attend  to some  new
datum, some new conclusion with dramatic consequences ("Don't breathe that
stuff!"), or simply because the current task seemed to be bogging down.

The control structure which this type of interaction suggested (to us)  is
an AGENDA of tasks, very much like the agenda of AM. Each task would  have
some priority rating, and when selected would fire production-like  rules,
until it was satisfied or until its quantum of cpu time expired (in  which
case it would be suspended).  During the firing of a rule, it could direct
the program to add new tasks to the agenda, modify the data base, ask  the
user for some information, tell him some, etc.
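
The agenda described above can be sketched in a few lines of Python.  This
is only an illustration under stated assumptions, not the RLL encoding: the
class names, the rule names, and the priority numbers are hypothetical, and
the cpu-time quantum is approximated by a count of rule firings.

```python
import heapq
import itertools


class Task:
    """A unit of work: a priority plus the rules still waiting to fire."""

    def __init__(self, name, priority, rules):
        self.name = name
        self.priority = priority
        self.pending = list(rules)


class Agenda:
    def __init__(self, quantum=2):
        self.quantum = quantum            # stand-in for a cpu-time quantum
        self.facts = {}                   # the "data base" rules may modify
        self.log = []                     # (task, rule) pairs in firing order
        self._heap = []
        self._order = itertools.count()   # tie-breaker: first added runs first

    def add(self, task):
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-task.priority, next(self._order), task))

    def run(self):
        while self._heap:
            _, _, task = heapq.heappop(self._heap)
            fired = 0
            while task.pending and fired < self.quantum:
                rule = task.pending.pop(0)
                rule(self)                # a rule may add tasks or change facts
                self.log.append((task.name, rule.__name__))
                fired += 1
            if task.pending:              # quantum expired: suspend the task
                self.add(task)


# Hypothetical rules; canned answers replace a real dialogue with the user.
def get_spill_type(agenda):
    agenda.facts["spill type"] = "oil"
    agenda.add(Task("characterize material", 9, [preliminary_id]))


def get_location(agenda):
    agenda.facts["location"] = "outfall"


def get_department_address(agenda):
    agenda.facts["department"] = "maintenance"


def preliminary_id(agenda):
    agenda.facts["material"] = agenda.facts["spill type"]


agenda = Agenda(quantum=2)
agenda.add(Task("interview discoverer", 5,
                [get_spill_type, get_location, get_department_address]))
agenda.run()
for entry in agenda.log:
    print(entry)
```

In this sketch the interview task fires two rules, is suspended when its
quantum runs out, yields to the higher-priority task that one of its rules
placed on the agenda, and is resumed afterwards to fire its remaining rule.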

While we have worked on RLL for some time, we had not (until this problem)
implemented this type of control structure; hence our first major task was
to describe it to RLL.  (This meant encoding it  as a collection of  units
including rules, tasks, priorities, special values returnable by rules and
by tasks, etc.)  These new control-related units were entered into one  of
our permanent system knowledge bases (EURISKO), rather than on the new one
we had created for this task (SPILL), because of the future utility of the
agenda mechanism.

The second "lack" we felt in the then-extant RLL system was the notion  of
gradual restriction (corresponding  to the SPEC  relation, defined in  the
MOLGEN UNITS package  [Stefik]).  In  particular, we needed  to deal  with
generic events, whose descendants could become gradually more specialized,
instantiated, particularized.  We  added the units  for events in  general
and  pipe  breaks,  flows,  etc.  in  particular.   We  also  added  units
describing the type of  gradual restriction we  wanted to have  connecting
events.  We  represented  several  kinds of  connections  between  events,
several  kinds  of  slots  that  were  new  to  RLL:    MoreGeneralEvents,
CausesOtherEvents,    CausedBy,    PriorEvents,     MoreSpecializedEvents,
LaterEvents, SimultEvents, etc.
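
One way to picture these event slots is as links between units that are
kept consistent in both directions.  The Python sketch below is only an
illustration: the slot names come from the list above, but pairing them as
inverses of one another is our assumption here, and the Unit class, the put
function, and the BreakAtPipe90 event are hypothetical.

```python
# Assumed pairing of slots into inverses; SimultEvents is its own inverse.
INVERSE = {
    "MoreGeneralEvents": "MoreSpecializedEvents",
    "CausesOtherEvents": "CausedBy",
    "PriorEvents": "LaterEvents",
    "SimultEvents": "SimultEvents",
}
INVERSE.update({inv: slot for slot, inv in INVERSE.items()})


class Unit:
    def __init__(self, name):
        self.name = name
        self.slots = {}

    def get(self, slot):
        return self.slots.get(slot, [])


def put(unit, slot, other):
    """Record `other` on unit.slot and `unit` on the inverse slot of other."""
    for holder, key, value in ((unit, slot, other),
                               (other, INVERSE[slot], unit)):
        values = holder.slots.setdefault(key, [])
        if value not in values:
            values.append(value)


def generalizations(unit):
    """All events reachable along MoreGeneralEvents, nearest first."""
    found, frontier = [], list(unit.get("MoreGeneralEvents"))
    while frontier:
        candidate = frontier.pop(0)
        if candidate not in found:
            found.append(candidate)
            frontier.extend(candidate.get("MoreGeneralEvents"))
    return found


event = Unit("Event")
pipe_break = Unit("PipeBreak")
flow = Unit("Flow")
break_at_pipe90 = Unit("BreakAtPipe90")   # hypothetical particular event

put(pipe_break, "MoreGeneralEvents", event)
put(flow, "MoreGeneralEvents", event)
put(break_at_pipe90, "MoreGeneralEvents", pipe_break)
put(pipe_break, "CausesOtherEvents", flow)

ancestors = [u.name for u in generalizations(break_at_pipe90)]
caused_by = [u.name for u in flow.get("CausedBy")]
specializations = [u.name for u in event.get("MoreSpecializedEvents")]
print(ancestors, caused_by, specializations)
```

Here a generic event is restricted step by step (Event, then PipeBreak,
then one particular break), and asserting a causal link in one direction
makes the reverse link available as well.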

The third thing we noticed  was that RLL had no  notion of a Problem.   It
had previously been used  only on open-ended types  of tasks, never  those
admitting a precise answer or solution.   Units for these concepts had  to
be added.

Finally, we began to enter units for concepts which had at least SOMETHING
to do with  the target  task: liquids, chemicals  (and oils  and acids  in
particular), pH,  flows,  containers,  mixings, etc.   At  this  level  of
abstraction, none of this was specific to the particular problem given.

The incorporation of  the above  units took  two days  of part-time  work;
probably 25 man-hours in  all. (Much additional time  was spent fixing  up
RLL: In addition to fleshing out many skeletons, like the agenda mechanism
mentioned above, there  were a  host of  low level  bugs which  had to  be
fixed.)  Before this task was completed, we had sketched out how we  would
represent such  task-specific  details as  the  White Oak  Creek  drainage
system, the four  major pieces  of legislation which  define the  possible
violations, the particular counter-measures which can be taken to halt the
flow of oil or acid, etc.  Units for some of these have been entered.  The
final type of problem-specific knowledge which we had to enter, to get RLL
"running" was the  set of  rules which manage  the various  phases of  the
spill   crisis   management   problem.    These   ranged   from    trivial
information-requesting rules (If  the discoverer's name  isn't known,  ask
it) to judgmental rules for counter-measures (If the flow is to be stopped
at a Weir, then use a skimmer). Not all of these have been added, and as a
result the "demos" produced by the system are incomplete. Essentially,  we
began entering task rules on Tuesday night -- into a system which was only
then at the stage most other groups had on Saturday.  Because most of  the
preliminary knowledge  was represented  in a  reasonable way,  it will  be
usable in the  future.  It  is important to  realize that  RLL itself  was
altered.  (We  are  NOT  including the  removal of  various bugs  in  this
category.)  In addition to the SPILL-related specific facts just  entered,
RLL  now  better  understands  agendae,  generic   objects,  and   control
mechanisms.  As these will remain in RLL, it will be considerably   easier
to implement subsequent applications which are "close" to this one.
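
The two example rules quoted above (ask for the discoverer's name if it is
not known; use a skimmer if the flow is to be stopped at a Weir) can be
written as condition/action pairs.  The Python below is a minimal sketch
under our own assumptions, not the form the rules take as RLL units: the
Rule class, the fact names, and the canned answer are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict, Callable[[str], str]], None]


def run_rules(rules, facts, ask):
    """Fire each applicable rule once; repeat until nothing new fires."""
    fired = []
    progress = True
    while progress:
        progress = False
        for rule in rules:
            if rule.name not in fired and rule.condition(facts):
                rule.action(facts, ask)
                fired.append(rule.name)
                progress = True
    return fired


def ask_name(facts, ask):
    facts["discoverer name"] = ask("What is your name?")


def choose_skimmer(facts, ask):
    facts.setdefault("countermeasures", []).append("skimmer")


rules = [
    # Trivial information-requesting rule.
    Rule("ask discoverer name",
         lambda facts: "discoverer name" not in facts,
         ask_name),
    # Judgmental rule for a counter-measure.
    Rule("skimmer at weir",
         lambda facts: facts.get("stop flow at") == "Weir",
         choose_skimmer),
]

facts = {"stop flow at": "Weir"}
fired = run_rules(rules, facts, ask=lambda question: "J. Smith")  # canned reply
print(fired, facts)
```

Keeping the condition and the action as separate, inspectable fields is one
small step toward representing the knowledge explicitly rather than burying
it in code.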

The details of our small implementation  can best be apprehended from  the
figures, traces,  knowledge bases,  etc.  which accompany  this  document.
Some simple  consultations (dialogues)  have been  run through,  including
directing the  user in  a backtrack  search to  locate the  source of  the
spill.
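
As a rough illustration of such a backtrack search, the Python sketch below
walks upstream from the point of discovery, at each step asking which of
the feeding conduits shows contamination.  It assumes that "backtracking"
means tracing the spill upstream through the drainage network; the network,
the node names, and the observations are invented for the example, and a
real consultation would put each question to the user.

```python
def locate_source(start, upstream, is_contaminated):
    """Follow contaminated conduits upstream until none remains.

    `upstream` maps a node to the nodes that flow into it.
    Returns the presumed source and the path taken to reach it.
    """
    path = [start]
    current = start
    while True:
        candidates = [node for node in upstream.get(current, [])
                      if node not in path and is_contaminated(node)]
        if not candidates:
            return current, path
        current = candidates[0]      # follow the first contaminated feeder
        path.append(current)


# Hypothetical drainage network: each entry lists what flows INTO the key.
upstream = {
    "Outfall": ["Manhole1", "Manhole2"],
    "Manhole2": ["TankA", "TankB"],
}
observed = {"Outfall", "Manhole2", "TankB"}   # canned user observations

source, path = locate_source("Outfall", upstream,
                             lambda node: node in observed)
print(source, path)
```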

Note in  particular  the  manner  in  which  one  task  starts  (interview
discoverer) but  spends only  a few  seconds  on it.   Some of  the  rules
associated with achieving that task are fired (getting the spill type  and
location), but many are not (getting the discoverer's department address).
Of higher priority is a  preliminary identification of the material  which
has  spilled,   and  so   the  Discovery   task  is   suspended  and   the
material-characterization task is chosen to  run.  After a preliminary  ID
is made (oil,  acid, perhaps one  level more detail,  but NOT the  precise
chemical composition or  trade name of  it), that task  too is  suspended.
The highest priority then is  Evaluating potential hazards.  Thus,  within
about 10 cpu seconds, RLL has formed a tentative picture of what  spilled,
where, and how dangerous it is.   Gradually, that picture is fleshed  out,
as more tasks are executed, and as suspended tasks are resumed and  worked
on some more.  The power of the agenda is in allowing any "high  priority"
rule to trigger at almost any time.

The versatility and adaptability of  this agenda mechanism, together  with
the later general utility of the knowledge, are the major strengths of this
implementation.  Similar  flexibility can  be found  in the  RLL  language
itself.  To understand its malleability, one has to consider the range  of
things which the RLL user  may regard as "parameters"  -- i.e. what he  is
allowed to specify, as opposed to finding hardwired in.

Each of the expert system building systems (ESBSs) has a different idea of
what qualifies as domain-specific information (that is, what the user should
be expected to enter).  For example, none of these ESBSs would be expected
to know, a priori, the specifics of this
particular plant, such as  "Pipe90 connects to  Pipe82" or "All  permanent
storage tanks are  diked".  Similarly,  none of these  systems would  have
facts at one  higher level  -- for example,  information about  chemistry,
(e.g. Oil#33 is corrosive) or connectivity, (e.g. that each pipe will flow
into some other pipe, unless it  leaks) - built in.  As such,  information
in both categories would have to be entered.

RLL goes one step  further, by allowing the  user to specify what  control
regime to use as well.  This does NOT imply that this information must  be
entered in  LISP code,  any more  than the  other facts  (pertaining,  for
example, to acids or dikes) had to be  given in so  low-level a   manner.
RLL first includes a set of known mechanisms (e.g. BackWard Chaining Rules
or Agenda), from which the user may conveniently select the one he wishes.
In addition, RLL provides a collection of tools, which the user can use to
construct his own new control regimes, if necessary.  These tools describe
the control information in high level, natural terms.

As for the weaknesses, one of the  most obvious ones is the extra cost  of
getting this system running: we  can't assert that Pipe3 flows-into  Pipe4
without first creating a unit for the relation flows-into, explaining that
that isa Slot,  that it is  meaningful for any  two conduits, etc.,  etc.,
etc. One thing that  might be expected  to be a  weakness is the  apparent
inefficiency this  high  degree  of "interpretiveness"  implies.   To  the
contrary, this is one of RLL's big strengths: see [Lenat, Hayes-Roth,  and
Klahr] for  details of  how  caching and  other techniques  recapture  the
efficiency that would otherwise be  lost.  Admittedly, the FIRST time  you
ask RLL to do something, it takes a LONG time, but from then on a  similar
type of request will  return fairly quickly.  One  severe weakness is  the
absence of a front-end; the user  must build his system by editing  units,
rather than through the nice human-engineered dialogue he can have  with,
e.g., EMYCIN.  The final SYSTEM produced, however, can have a simple user interface
(and in fact this is one reason we  had the ROLE of our system be that  of
the expert -- it could have the initiative almost exclusively, and  simply
ask questions of the user).
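
The up-front cost mentioned above (declaring flows-into before Pipe3
flows-into Pipe4 can be asserted) can be made concrete with a small Python
sketch.  It is an illustration under our own assumptions, not RLL's actual
machinery: the KnowledgeBase class, the domain and range slot names, and
the treatment of isa as a single transitive link are all hypothetical.

```python
class KnowledgeBase:
    def __init__(self):
        self.units = {}

    def make_unit(self, name, **slots):
        self.units[name] = dict(slots)

    def isa(self, name, category):
        """True if `category` is reachable from `name` along isa links."""
        pending = list(self.units.get(name, {}).get("isa", []))
        seen = set()
        while pending:
            current = pending.pop()
            if current == category:
                return True
            if current not in seen:
                seen.add(current)
                pending.extend(self.units.get(current, {}).get("isa", []))
        return False

    def assert_fact(self, subject, slot, value):
        # The relation itself must first exist as a unit that isa Slot.
        if not self.isa(slot, "Slot"):
            raise ValueError(f"{slot} has not been declared as a Slot")
        declaration = self.units[slot]
        if not self.isa(subject, declaration["domain"]):
            raise ValueError(f"{slot} is not meaningful for {subject}")
        if not self.isa(value, declaration["range"]):
            raise ValueError(f"{slot} cannot take {value} as a value")
        self.units[subject].setdefault(slot, []).append(value)


kb = KnowledgeBase()
kb.make_unit("Pipe", isa=["Conduit"])
kb.make_unit("Pipe3", isa=["Pipe"])
kb.make_unit("Pipe4", isa=["Pipe"])

try:
    kb.assert_fact("Pipe3", "flows-into", "Pipe4")
    rejected_before_declaration = False
except ValueError:
    rejected_before_declaration = True   # flows-into is not yet a unit

# Declare the relation: it isa Slot, meaningful for any two conduits.
kb.make_unit("flows-into", isa=["Slot"], domain="Conduit", range="Conduit")
kb.assert_fact("Pipe3", "flows-into", "Pipe4")
print(rejected_before_declaration, kb.units["Pipe3"]["flows-into"])
```

The extra declaration is what lets the system check that an assertion is
meaningful, at the price of more work before the first fact can be entered.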

In this experiment,  we have been  forced to the  realization that, for  a
small amount of time, a simpler language (such as EMYCIN or LISP) is  able
to achieve SOME  results more quickly.   Some of the  goals of RLL,  which
include aiding the  user in  producing an  expert system,  just aren't yet
realized.  The  experience  has  also  reinforced  our  view  of  the
process of  building  an  expert  system as  an  incremental  approach  to
competence.  Innumerable times, compromises have to be made, sacrifices of
"the right way"  to the  altar of "getting  started".  We  have honed  our
abilities to make such sacrifices (one of the requisites of a C.K.E.), and
have honed our facilities  to make them  in a way  that does not  preclude
redoing things in a  better way later (another  CKE requisite).  To  close
with one example of this process,  our original design had four  Violation
rules, one for each piece of legislation; later, as we learned more  about
the complexities  of  those  regulations, we  realized  the  necessity  of
replacing those four  rules with four  separate tasks, each  of which  had
several rules attached.   This kind  of flexibility,  which is  admittedly
just beginning in RLL, is the cornerstone of successful KEing.